1.  INTRODUCTION


1.1  Limited  Bandwidth  Tactical  Networks.  The  purpose  of  a  network  is  to  serve  as  a  carrier  of 
information  from  one  point  to  another.  On  a  limited  bandwidth  tactical  network,  the  number  of  nodes  and 
the amount of information to pass can be large, especially during peak battle periods. The effective
distribution of information can enhance the decision process on the battlefield, while the impact of making
decisions from old information can be catastrophic (Brodeen, Kaste, and Broome 1992).

1.2  Network  Effectiveness.  To  measure  a  network's  effectiveness,  one  must  determine  whether  the 
messages  the  network  services  arrive  at  their  destination  correctly  and  in  time  to  be  useful.  The  amount 
of  correctly  passed  information  is  referred  to  as  "network  throughput,"  and  the  amount  of  time  required 
to pass that information as "network delay." There are a number of parameters that can impact throughput
and delay; for example, the number of messages to be transmitted, the size of the messages, the number
of  nodes  on  the  network,  the  communications  protocol,  and  the  communications  hardware.  If  the 
interaction  of  these  network  parameters  is  understood,  the  network’s  effectiveness  can  be  optimized. 

1.3  Experimentation  vs.  Simulation.  One  way  to  examine  the  interaction  of  network  parameters  is 
through  simulation.  But  communications  protocols  are  often  too  complex  to  model  precisely.  The 
simulations  often  take  required  input,  such  as  the  probability  two  or  more  messages  will  collide,  the 
expected  delay  in  message  transmission,  or  the  arrival  rate  of  messages  at  a  given  node,  and  extrapolate 
those estimates to a large scenario of multiple nodes. These drastic assumptions, usually made to simplify
the simulation, may actually result in an unrealistic representation of the protocol. Controlled
experimentation  with  the  actual  communications  protocol  on  the  intended  hardware  offers  much  insight 
into  the  behavior  of  the  protocol  under  various  conditions,  facilitating  the  modeling  and  simulation  efforts 
(Brodeen,  Kaste,  and  Broome  1992). 

1.4 The Verification, Validation, and Accreditation (VV&A) Process. Recent events have caused the
defense community (e.g., various defense and service science boards, the General Accounting Office
[GAO], the Defense Modeling and Simulation Office [DMSO], etc.) to refocus considerable attention on
the VV&A process of the models and simulations it uses. A forthcoming Department of Defense (DOD)
directive on modeling and simulation will require each military service to establish VV&A policies,
guidelines, and procedures. The research outlined in this paper presents an enhancement to the formal
results validation procedure. Results validation will hereafter refer to the formal documented review
process that compares responses of a model and/or simulation with known or expected behavior from the
subject  or  system  it  represents  to  ascertain  that  the  model/simulation  responses  are  sufficiently  accurate 
for  intended  uses.  A  variety  of  methods  may  be  employed  in  results  validation:  comparison  with  expert 
expectation  (i.e.,  high  face  validation),  actual  test  data,  results  from  other  models,  or  historical  data 
(Sargent  1992). 

•  Objectives  and  Challenges.  Experimentation  with  a  simulation  is  only  a  surrogate  for  actually  being 
able  to  experiment  with  an  existing  or  proposed  system.  A  reasonable  goal  of  validation  is  to  ensure  that 
a simulation is developed that can actually be used by a decision maker to make the same decision that
would  have  been  made  if  it  were  feasible  and  cost-effective  to  experiment  with  the  actual  system. 
Validation  should  enhance  the  confidence  placed  in  the  results  produced  by  the  simulation.  The  challenge 
is  to  develop  a  validation  process  that  is  at  the  same  time  feasible  yet  more  effective,  and  can  be  applied 
to  both  existing  simulations  as  well  as  newly  developed  ones. 

1.5 Current Research. Simulation and modeling are widely accepted means of analyzing real-world
systems that are too complex to model analytically. Most communications networks fall into this category.
But model credibility suffers when a continuing verification and validation program is not undertaken,
thereby diluting the value of analyses the models support. It is not uncommon within a military
organization to find several groups each developing a network simulation that performs essentially the
same tasks; the differences usually lie in the model assumptions and/or definitions of simulation responses.
An independent evaluator is called upon to assess the performance of several simulations against limited
empirical data. The product of this research will be to formalize a multivariate multisample rank sum test
that will enhance long-term efforts to standardize the process of building, verifying, and validating
command, control, and communications (C3) simulations for flexibly addressing issues related to low-level
information distribution on the battlefield. This research will also serve to strengthen the link between
experimentation  and  simulation,  both  of  which  should  be  utilized  in  evaluating  communications  protocols’ 
measures  of  performance  (MOP). 

2.  PERMUTATION  TESTS 

2.1 Conditional Nonparametric Hypothesis Tests. In this section, we consider the construction of
nonparametric (distribution-free) hypothesis tests whose critical regions are determined from information
gained from observed data. The critical region is thus conditional, since it can be created only after the
data have been observed. Nonetheless, the test procedure has overall significance level $\alpha$ because the
critical region is constructed to assure the conditional probability of rejecting a valid null hypothesis $H_0$
remains $\alpha$. Conditional hypothesis tests are discussed at several levels of theoretical intensity, ranging
from Conover (1971), Noreen (1989), Randles and Wolfe (1979), and Edgington (1987) to Puri and Sen
(1993). Our ultimate interest lies in hypothesis testing in a multivariate multisample framework; but to
fix ideas and, to some extent, notation, we begin with consideration of a two-sample univariate location
problem.

2.2 General Setting for Rank Statistics. Let $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$ be independent
random samples from continuous distributions with cumulative distribution functions (c.d.f.) $F(x)$ and
$F(x - \delta)$, respectively, where $-\infty < \delta < \infty$, and define

$$Z_i = X_i, \quad i = 1, \ldots, m; \qquad Z_i = Y_{i-m}, \quad i = m + 1, \ldots, N, \tag{1}$$

with $N = m + n$. Let $Z_{(1)} < \cdots < Z_{(N)}$ denote the combined sample order statistics and

$$Z_{(\cdot)} = \left(Z_{(1)}, \ldots, Z_{(N)}\right) \tag{2}$$

the vector of order statistics. If the distributions of the random variables X and Y are identical (i.e.,
$\delta = 0$), then every arrangement of the X's and Y's in the ordered combined sample should be equally
likely. This is the basic principle underlying many nonparametric procedures based on ranks, and is
established formally as Theorem 2.3.

2.3 Theorem 2.3. Let $Z_1, \ldots, Z_N$ be a random sample from a continuous distribution,
and let $R = (R_1, \ldots, R_N)$ be the corresponding vector of ranks (i.e., $Z_i = Z_{(R_i)}$,
$i = 1, \ldots, N$). If $\mathcal{R}$ is the permutation group of the integers $1, \ldots, N$, then $R$ is distributed
uniformly over $\mathcal{R}$ (Randles and Wolfe 1979).

• Proof. The permutation group $\mathcal{R}$ has $N!$ elements. It will suffice to show that $R$ assumes each of
the permutations of $(1, \ldots, N)$ with probability $1/N!$. Let $r = (r_1, \ldots, r_N) \in \mathcal{R}$ be an arbitrary
element of $\mathcal{R}$. Then

$$P(R = r) = P\left[(Z_1, \ldots, Z_N) = \left(Z_{(r_1)}, \ldots, Z_{(r_N)}\right)\right] = P\left(Z_{d_1} < \cdots < Z_{d_N}\right), \tag{3}$$

where $d_i$ is the location of the number $i$ in the permutation $r$, for $i = 1, \ldots, N$. But $Z_1, \ldots, Z_N$
are independent and identically distributed (i.i.d.) random variables; therefore,

$$P(R = r) = P\left(Z_{d_1} < \cdots < Z_{d_N}\right) = P\left(Z_1 < \cdots < Z_N\right) = P(R = r_0), \tag{4}$$

where $r_0 = (1, \ldots, N)$. But $r$ is arbitrary, and the cardinality of $\mathcal{R}$ is $N!$; hence,

$$P(R = r) = 1/N!. \tag{5}$$

This completes the proof.

2.4 Theorem 2.4. Let $Z_1, \ldots, Z_N$ be i.i.d. continuous random variables, and let
$R = (R_1, \ldots, R_N)$ denote the rank vector of these observations; that is, $R_i$ is the rank of $Z_i$ among
$Z_1, \ldots, Z_N$. Let $Z_{(1)} < \cdots < Z_{(N)}$ be the order statistics of $Z_1, \ldots, Z_N$. Then $R$ and
$\left(Z_{(1)}, \ldots, Z_{(N)}\right)$ are independent (Randles and Wolfe 1979).

• Proof. It will suffice to show that the conditional distribution of $Z_{(1)} < \cdots < Z_{(N)}$, given
$R = r$, is equal to the marginal distribution of $Z_{(1)} < \cdots < Z_{(N)}$, for arbitrary $r \in \mathcal{R}$. Consider
$r = r^* = (1, \ldots, N)$. Then $Z_{(1)} = Z_1, \ldots, Z_{(N)} = Z_N$ and, with $f$ denoting the common density of the $Z_i$,

$$g\left(z_{(1)}, \ldots, z_{(N)} \mid R = r^*\right) = \prod_{i=1}^{N} f\left(z_{(i)}\right) \Big/ P\left[R = (1, \ldots, N)\right] = N! \prod_{i=1}^{N} f\left(z_{(i)}\right), \tag{6}$$

which is the joint unconditional distribution of $Z_{(1)}, \ldots, Z_{(N)}$. This completes the proof.

2.5 Permutation Principle. With the aid of Theorems 2.3 and 2.4, the expression

$$P_0\left[Z_1 = z_{(r_1)}, \ldots, Z_N = z_{(r_N)} \mid Z_{(\cdot)} = z_{(\cdot)}\right] = 1/N! \tag{7}$$

is established. This equation is the mathematical statement of the permutation principle, and it provides
the basis for construction of conditional distribution-free tests of hypotheses. Conditioning on the order
statistics vector $Z_{(\cdot)}$ reflects that the data have been observed. Under the assertion of identical
distributions ($H_0$: $\delta = 0$), the population labels are suppressed, and every arrangement of the data is
equally likely. The transformation from the permutation principle (equation 7) to the mechanics of
hypothesis test construction is best conveyed by example.

2.6 Two-Sample Univariate Location Problem. Taylor (1992) presents two sets of measurements
made on spin rates of long-rod penetrators corresponding to two distinct fin configurations, where
$x_1 = 97.5$, $x_2 = 122.2$, $x_3 = 108.2$, and $y_1 = 78.1$, $y_2 = 76.7$, $y_3 = 88.5$ are the observed values
of random samples of sizes $m = n = 3$ from continuous distributions with c.d.f.'s $F(x)$ and $F(x - \delta)$,
respectively. The observed order statistics of the combined sample are then
$z_{(1)} = 76.7$, $z_{(2)} = 78.1$, $z_{(3)} = 88.5$, $z_{(4)} = 97.5$, $z_{(5)} = 108.2$, and $z_{(6)} = 122.2$. Since
there is no constraint as to which of the two fin configurations might provide the larger mean, a two-tailed
test is appropriate. To construct a conditional test that is a distribution-free permutation test of
$H_0$: $\delta = 0$ against the alternative $H_1$: $\delta \neq 0$, we desire a statistic, $S(X_1, X_2, X_3; Y_1, Y_2, Y_3)$,
that is a measure of $\delta$. In this example, we select $S(X_1, X_2, X_3; Y_1, Y_2, Y_3) = \bar{Y} - \bar{X}$ with
$\bar{X} = \frac{1}{3} \sum_{i=1}^{3} X_i$ and $\bar{Y} = \frac{1}{3} \sum_{j=1}^{3} Y_j$. Next, we compute the $\binom{6}{3} = 20$ possible values of
$\bar{Y} - \bar{X}$ corresponding to the ways in which we can assign three of the ordered values $z_{(1)}, \ldots, z_{(6)}$
to be designated as x's. The resulting values of $S$, denoted by $s_1, \ldots, s_{20}$, are given in Table 1. The
associated conditional null distribution of $\bar{Y} - \bar{X}$, given $Z_{(\cdot)} = z_{(\cdot)}$, then assigns probability 1/20 to
each of the $s_i$ values in Table 1. Since both small and large values of $\bar{Y} - \bar{X}$ are indicative of the
alternative $H_1$: $\delta \neq 0$, the critical regions for the corresponding permutation test would contain an
appropriate number of the largest $|s_i|$ values. Only 2 of the 20 data permutations yield a value of
$|\bar{y} - \bar{x}|$ as large as the observed 28.200. Thus, an $\alpha = .10$ ($= 2/20$) level critical region would be
$C_{.10} = \{28.200, -28.200\}$. This is the smallest level at which this permutation test would reject
$H_0$: $\delta = 0$ in favor of $H_1$: $\delta \neq 0$. The data for this example correspond to $s_{20} = -28.200$.
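The enumeration described in section 2.6 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' program: the function names (`mean_diff`, `permutation_values`, `two_sided_p_value`) are hypothetical, and the data are the six spin-rate values quoted above. Because the code carries full floating-point precision, a few of its values may differ from Table 1 in the third decimal place.

```python
from itertools import combinations

# Observed spin-rate samples quoted in section 2.6.
x = [97.5, 122.2, 108.2]
y = [78.1, 76.7, 88.5]

def mean_diff(xs, ys):
    """Statistic S = mean(y) - mean(x)."""
    return sum(ys) / len(ys) - sum(xs) / len(xs)

def permutation_values(x, y):
    """Value of S for every way of labeling len(x) pooled values as x's."""
    pooled = sorted(x + y)
    values = []
    for idx in combinations(range(len(pooled)), len(x)):
        xs = [pooled[i] for i in idx]
        ys = [pooled[i] for i in range(len(pooled)) if i not in idx]
        values.append(mean_diff(xs, ys))
    return values

def two_sided_p_value(x, y, tol=1e-9):
    """Share of arrangements whose |S| is at least the observed |S|."""
    observed = mean_diff(x, y)
    values = permutation_values(x, y)
    extreme = sum(1 for s in values if abs(s) >= abs(observed) - tol)
    return observed, extreme / len(values)

observed, p_value = two_sided_p_value(x, y)
print(len(permutation_values(x, y)), round(observed, 3), p_value)
```

The small tolerance `tol` guards the tie between the observed value and its mirror image against floating-point round-off; for larger samples, full enumeration would normally be replaced by random sampling of arrangements.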

2.7  Multivariate  Extension.  The  permutation  principle,  which  has  thus  far  been  restricted  to  a 
univariate  two-sample  setting,  can  be  extended  and  applied  to  a  wide  variety  of  statistical  problems.  The 
construction of conditional tests finds application in fundamental considerations of multivariate analysis
where counting and ranking techniques do not lend themselves effectively to small sample situations.
Puri and Sen (1993) provide a rigorous treatment of the use of conditional tests in dealing with problems
in multivariate data analysis. The approach in the following sections corresponds in the main to their
development.

2.8 Theoretical Development. Let

$$X_j^k = \left(X_{1j}^k, \ldots, X_{pj}^k\right)', \quad j = 1, \ldots, n_k, \quad k = 1, \ldots, c \tag{8}$$

be independent p-dimensional random variables from c continuous distributions with the c.d.f. of
$X_j^k$ denoted by $F_k(x)$, $k = 1, \ldots, c$. The data structure is that of a multivariate multisample
($2 \leq c$) location problem; i.e.,

$$F_k(x) = F(x - \delta_k), \quad k = 1, \ldots, c, \tag{9}$$

and the interest is in testing $H_0$: $\delta_1 = \cdots = \delta_c$ against the alternative $\delta_r \neq \delta_s$ for some $r \neq s$.

Table 1. Permutation Values of $\bar{y} - \bar{x}$

Values Assigned as $x_i$'s      Value of $\bar{y} - \bar{x}$
76.7, 78.1, 88.5                $s_1 = 28.200$
76.7, 78.1, 97.5                $s_2 = 22.200$
76.7, 78.1, 108.2               $s_3 = 15.067$
76.7, 78.1, 122.2               $s_4 = 5.733$
76.7, 88.5, 97.5                $s_5 = 15.266$
76.7, 88.5, 108.2               $s_6 = 8.133$
76.7, 88.5, 122.2               $s_7 = -1.200$
76.7, 97.5, 108.2               $s_8 = 2.133$
76.7, 97.5, 122.2               $s_9 = -7.200$
76.7, 108.2, 122.2              $s_{10} = -14.334$
78.1, 88.5, 97.5                $s_{11} = 14.334$
78.1, 88.5, 108.2               $s_{12} = 7.200$
78.1, 88.5, 122.2               $s_{13} = -2.133$
78.1, 97.5, 108.2               $s_{14} = 1.200$
78.1, 97.5, 122.2               $s_{15} = -8.133$
78.1, 108.2, 122.2              $s_{16} = -15.266$
88.5, 97.5, 108.2               $s_{17} = -5.733$
88.5, 97.5, 122.2               $s_{18} = -15.067$
88.5, 108.2, 122.2              $s_{19} = -22.200$
97.5, 108.2, 122.2              $s_{20} = -28.200$

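2.9 Matrix Representation. The combined sample of these data is naturally represented as a matrix
of observations of the form

$$X = \left(X_1^1, \ldots, X_{n_1}^1, \; \ldots, \; X_1^c, \ldots, X_{n_c}^c\right), \tag{10}$$

where $X$ is a $p \times N$ matrix, with $N = n_1 + \cdots + n_c$, in which the columns are the vector-valued observations; i.e.,

$$X = \begin{pmatrix}
X_{11}^1 & \cdots & X_{1 n_1}^1 & \cdots & X_{11}^c & \cdots & X_{1 n_c}^c \\
\vdots & & \vdots & & \vdots & & \vdots \\
X_{p1}^1 & \cdots & X_{p n_1}^1 & \cdots & X_{p1}^c & \cdots & X_{p n_c}^c
\end{pmatrix}. \tag{11}$$
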

2.10 Data Analysis. In the construction of distribution-free procedures, the data are often replaced
by their ranks. This transformation may be attractive for a number of reasons, a principal one being that the
distribution of a test statistic need be established only once. Otherwise, a customized test (such as given
in  section  2.6)  must  be  constructed  for  every  set  of  data.  The  customized  test  in  which  the  data 
themselves  serve  as  scores  was  an  idea  originally  advanced  by  Fisher  (1935) — rank  tests  and  tests  based 
on  rank  scores  are  descendants.  The  development  to  follow,  consistent  with  Puri  and  Sen  (1993),  will  use 
the  rank  representation,  but  with  the  understanding  that  different  score  functions  remain  a  viable  alternative 
approach. 


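2.11 Rank Matrix Representation. If the N observations on the $i$th variate $X_{ij}^k$,
$j = 1, \ldots, n_k$, $k = 1, \ldots, c$, are arranged in ascending order, and $R_{ij}^k$ denotes the rank
of $X_{ij}^k$, the observation matrix $X$ gives rise to a corresponding matrix of ranks

$$R = \begin{pmatrix}
R_{11}^1 & \cdots & R_{1 n_1}^1 & \cdots & R_{11}^c & \cdots & R_{1 n_c}^c \\
\vdots & & \vdots & & \vdots & & \vdots \\
R_{p1}^1 & \cdots & R_{p n_1}^1 & \cdots & R_{p1}^c & \cdots & R_{p n_c}^c
\end{pmatrix}. \tag{12}$$

Each row of this matrix is a random permutation of the integers $1, \ldots, N$, and thus $R$ is a random
matrix that can have $(N!)^p$ possible realizations.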

2.12 Multivariate Analogue of the Permutation Principle. Since the p variates $X_{ij}^k$, $i = 1, \ldots, p$,
are, in general, stochastically dependent, the joint distribution of the elements of $R$ will depend on the
underlying distribution $F$, even when the null hypothesis of identical distributions is valid. However, when
$F_1(x) = \cdots = F_c(x)$, the vectors $X_j^k$, $j = 1, \ldots, n_k$, $k = 1, \ldots, c$, are i.i.d. and the joint
distribution remains invariant under any permutation of the vectors among themselves. This means the
conditional distribution of $R$ over the set of $N!$ configurations generated by permutations of the columns
of $R$, denoted by $S(R)$, will be uniform; i.e.,

$$P_0\left(R = r \mid S(R)\right) = 1/N! \quad \forall \, r \in S(R). \tag{13}$$

This is the multivariate analogue of the permutation principle (section 2.5) restricted to rank
representation.


2.13 Average Rank Scores. Under $H_0$, all the observations $X_j^k$,
$j = 1, \ldots, n_k$, $k = 1, \ldots, c$, have the same distribution. Consequently, for each variate $i$,
the mean of the ranks assigned to the $k$th sample

$$T_i^k = \frac{1}{n_k} \sum_{j=1}^{n_k} R_{ij}^k \tag{14}$$

should be close in value to the overall mean $E_i$, where

$$E_i = \frac{1}{N} \sum_{k=1}^{c} \sum_{j=1}^{n_k} R_{ij}^k. \tag{15}$$

(The expression for $E_i$ is unnecessarily cumbersome for this application, since $E_i$ is simply the mean of
the integers $1, \ldots, N$; therefore, $E_i = (N + 1)/2$, $1 \leq i \leq p$. It is written in this form to allow
for subsequent inclusion of scores $a(1), \ldots, a(N)$ other than ranks.)


2.14 Rank Order Test Statistic. A test for $H_0$ based on the contrasts between the mean scores
$T_i^k$ is intuitively appealing. The set of $p(c-1)$ contrasts $T_i^k - E_i$, $i = 1, \ldots, p$,
$k = 1, \ldots, c$, should, under $H_0$, be numerically small stochastically. For a global assessment
of $H_0$, a test based on a function of the contrasts that would be sensitive to the numerical largeness
of any contrast seems appropriate. A positive definite quadratic form in $T_i^k - E_i$ will accommodate
this. Puri and Sen (1993) advance as a test statistic

$$L_N = \sum_{k=1}^{c} n_k \left[\left(T^k - E\right) V^{-1}(R) \left(T^k - E\right)'\right], \tag{16}$$

where $T^k = \left(T_1^k, \ldots, T_p^k\right)$ and $E = (E_1, \ldots, E_p)$. $L_N$ is a weighted sum of c quadratic forms
in $(T^k - E)$, $k = 1, \ldots, c$, with a common discriminant $V^{-1}(R)$.

2.15 The Discriminant. The discriminant $V^{-1}(R)$ is the inverse of the covariance matrix of the
variates $T_i^k$. There remains to determine the elements of this matrix. Under the null hypothesis of a
common distribution, $T_i^k$ is distributed as the mean of $n_k$ integers selected at random, without
replacement, from the integers $1, \ldots, N$. The expected value of $T_i^k$ is then

$$E_0\left(T_i^k\right) = \frac{N + 1}{2} \quad \text{(Conover 1971)}. \tag{17}$$

The notation $E_0$ denotes expectation under $H_0$. To determine the covariance of the variates
$T_i^k$, $T_{i'}^k$, $i, i' = 1, \ldots, p$, the following result is useful.

• Lemma. Let $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$ be random samples (not necessarily
independent), and let $\bar{X} = \frac{1}{m} \sum_{i=1}^{m} X_i$, $\bar{Y} = \frac{1}{n} \sum_{j=1}^{n} Y_j$; then $E[\bar{X}\bar{Y}] = E[XY]$.

• Proof. The result is established by the following:

$$E[\bar{X}\bar{Y}] = \frac{1}{m} \sum_{i=1}^{m} E\left(X_i \bar{Y}\right) = E\left(X \bar{Y}\right) = \frac{1}{n} \sum_{j=1}^{n} E\left(X Y_j\right) = E[XY]. \tag{18}$$

This completes the proof.
2.16 The Covariance. The expected value of the product of the sample means is then (from the
lemma in section 2.15) the expectation of the product of the variates whose realizations provide the summands.
This observation is important, since it permits estimation of the covariance of
$T_i^k$, $T_{i'}^k$, $i, i' = 1, \ldots, p$, by the expression

$$\mathrm{Cov}_0\left(T_i^k, T_{i'}^k\right) = \frac{1}{N} \sum_{k=1}^{c} \sum_{j=1}^{n_k} R_{ij}^k R_{i'j}^k - E_i E_{i'}, \tag{19}$$

analogous to the relation $\mathrm{Cov}(x, y) = E[xy] - E[x]E[y]$.
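As a quick check on the covariance expression of equation 19, consider the diagonal case $i = i'$. The step below assumes the expression has the form "average product of ranks minus product of mean ranks," as the analogy with $E[xy] - E[x]E[y]$ suggests. Because each row of the rank matrix is a permutation of $1, \ldots, N$ (no ties), the sum of squared ranks does not depend on the data, and the diagonal entry reduces to the variance of the integers $1, \ldots, N$. The value $N = 10$ is the combined sample size used in section 4.

```latex
\begin{align*}
\frac{1}{N}\sum_{r=1}^{N} r^{2} - \left(\frac{N+1}{2}\right)^{2}
  &= \frac{(N+1)(2N+1)}{6} - \frac{(N+1)^{2}}{4} \\
  &= \frac{N^{2}-1}{12}, \\
N = 10: \quad \frac{10^{2}-1}{12} &= 8.25 .
\end{align*}
```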


2.17 The Quadratic Form. The quadratic form is attractive in that the correlation structure
between the variates $T_i^k$, $i = 1, \ldots, p$, is taken into account through the covariance matrix $V(R)$.
Scaling of the variates was simultaneously accomplished by assignment of ranks.
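The pieces of sections 2.11 through 2.17 can be assembled into a short sketch. It is not the authors' implementation: the function names are hypothetical, the statistic is assumed to be the weighted sum of quadratic forms in equation 16, and $V(R)$ is assumed to have entries of the form given in equation 19 (average product of ranks minus product of mean ranks). For two samples the conditional null distribution is obtained by enumerating every choice of columns for the first sample; ties are assumed absent.

```python
from itertools import combinations

def ranks(row):
    """Ranks 1..N of the values in one row (no ties assumed)."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    r = [0] * len(row)
    for rank, j in enumerate(order, start=1):
        r[j] = rank
    return r

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def rank_statistic(R, groups):
    """Weighted sum of quadratic forms for rank matrix R (p rows, N columns).

    groups holds one list of column indices per sample.
    """
    p, N = len(R), len(R[0])
    E = [(N + 1) / 2.0] * p
    # Assumed covariance entries: mean product of ranks minus product of means.
    V = [[sum(R[i][j] * R[h][j] for j in range(N)) / N - E[i] * E[h]
          for h in range(p)] for i in range(p)]
    total = 0.0
    for cols in groups:
        d = [sum(R[i][j] for j in cols) / len(cols) - E[i] for i in range(p)]
        w = solve(V, d)  # w = V^{-1} d
        total += len(cols) * sum(di * wi for di, wi in zip(d, w))
    return total

def permutation_p_value(X, n1):
    """Exact two-sample p-value; the first n1 columns of X form sample 1."""
    R = [ranks(row) for row in X]
    N = len(X[0])
    observed = rank_statistic(R, [list(range(n1)), list(range(n1, N))])
    count = total = 0
    for cols in combinations(range(N), n1):
        rest = [j for j in range(N) if j not in cols]
        total += 1
        if rank_statistic(R, [list(cols), rest]) >= observed - 1e-9:
            count += 1
    return observed, count / total

# Hypothetical data: p = 2 measures, 3 + 7 = 10 observations.
X = [
    [9.1, 8.7, 9.5, 4.2, 5.1, 3.9, 6.0, 4.8, 5.5, 6.3],
    [2.1, 2.4, 1.9, 3.8, 3.1, 4.0, 2.9, 3.5, 3.3, 2.7],
]
print(permutation_p_value(X, 3))
```

With sample sizes 3 and 7 there are only 120 column assignments, so full enumeration is cheap. The sketch fails with a division by zero if two variates have perfectly correlated ranks, because the assumed $V(R)$ is then singular.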

2.18 Application. This methodology will be applied to a communications network simulation
validation  in  section  4. 

3.  TACTICAL  NETWORK  EXPERIMENT  AND  SIMULATION  DEVELOPMENT 

3.1  Background.  A  controlled  laboratory  experiment  was  conducted  at  the  U.S.  Army  Research 
Laboratory's (ARL) Command, Control, Communications, and Computers (C4) Research Facility during
the summer of 1991 to evaluate the performance of a tactical communications protocol over combat net
radios (CNR) (Kaste, Brodeen, and Broome 1992). The approach was to quantify the effects of message
arrival rate and message length on the throughput and delay of a small combat radio net. The results
provided statistically sound baseline information to be used as input for network simulations, partial
guidelines for designing network architectures and communications protocols, and for future experiments
on combat radio nets. Eventually a small communications network simulation was developed utilizing
the  OPNET  simulation  tool,  duplicating  the  configuration  of  the  aforementioned  experiment. 


3.2 Experimental Design Factors of Interest. The two factors tested in the experiment were message
arrival rate and message length. Four levels of message arrival rate were tested with each of four levels of
message length (i.e., a full-factorial design), yielding 16 test combinations. The levels for message arrival
rate were 100, 250, 350, and 500 messages per node. The levels for message length were 48, 144, 256,
and 352 characters.
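The 16 test combinations can be listed mechanically. The sketch below is illustrative only: the names `NODES`, `ARRIVAL_PER_NODE`, `MESSAGE_LENGTH`, and `test_combinations` are hypothetical, and the network-wide total assumes the four-node net with messages divided equally among nodes, as described in section 3.3.

```python
from itertools import product

NODES = 4                                  # four-node net (section 3.3)
ARRIVAL_PER_NODE = [100, 250, 350, 500]    # messages per node per 1-hr test cell
MESSAGE_LENGTH = [48, 144, 256, 352]       # characters

def test_combinations():
    """Full-factorial crossing of arrival rate and message length."""
    return [
        {"per_node": rate, "network_total": rate * NODES, "length": length}
        for rate, length in product(ARRIVAL_PER_NODE, MESSAGE_LENGTH)
    ]

cells = test_combinations()
print(len(cells))
print(cells[0])
print(cells[-1])
```

The first and last cells are the two extreme conditions, the lowest rate with the shortest message and the highest rate with the longest message.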

• Design Matrix. It was decided the shortest reasonable time to test any 1 of the 16 combinations
was 1 hr. Since the testing of all 16 combinations required a minimum of 16 hr for a single replication,
which realistically could not be completed in 1 day, a randomized incomplete block design was
constructed in order that day-to-day variability would not influence the results. The 16 combinations were
divided into blocks of size 4, and the 4 blocks were run over a 4-day period. The assignment of the
combinations into blocks was based on a confounding scheme. This scheme, in which a different set of
three of the 9 degrees of freedom for the interaction term was completely confounded within each
replication, assured the effects of message arrival rate and message length, as well as their interaction, on
network throughput and network delay could be measured. Three replications of the design matrix were
made to ensure the incomplete block design was balanced, thereby facilitating the analysis, although part
of the precision of the estimate of the interaction effect was sacrificed (i.e., the relative information
available for the interaction term was two-thirds).

3.3  Experimental  and  Simulation  Configurations.  The  experiment  consisted  of  four  nodes,  each  of 
which was a SUN workstation, communicating over a tactical network. Each contained a message driver,
providing  communications  loading,  and  data  collection  software  to  log  the  sending  and  receipt  of  messages 
and  acknowledgments  as  well  as  information  on  queues,  as  depicted  in  Figure  1.  The  nodes  were 
connected  to  modems  to  enable  communications  via  radios  that  could  communicate  in  single-channel  (SC) 
or  in  frequency-hopping  (FH)  mode.  It  was  decided  to  simulate  only  the  SC  capability.  The  modems 
allowed  communication  using  a  specified  tactical  net-sensing  algorithm  and  communications  protocol. 
To  minimize  error  rates,  the  radios  were  placed  no  more  than  3  ft  apart  and  were,  therefore,  set  to  low 
power. Resistor loads were used in place of antennas to avoid interference. The analogous four-node
simulation configuration utilizing the OPNET tool is represented in Figure 2. Figure 3 depicts the
structure of an individual tactical node. Each node has three processor modules, a queue module that
performs the bulk of the channel-access processing, and a pair of radio receiver and transmitter modules.

• The Server Model. The four message arrival rates emulated the rate of actual user-generated
messages and specific nodes' ability to respond to incoming messages. For the experiment, the arrival
rate, $\lambda$, represented the number of messages generated during a 1-hr test cell and queued for transmission
on the net, not the number of messages actually transmitted during the hour. A message was assumed to
enter network service when it reached the modem, as depicted by the area inside the dashed line in
Figure 4. Thus, the server was considered a combination of modem and combat radio network. The
queue was the area outside the dashed line. A scenario generator was written to create "messages" of
character strings of a specified length and arrival rate over a 1-hr period. The simulation, then,
had to accommodate varying message lengths and arrival rates. Once the message was
generated, the communications protocol added several layers of information to ensure the message arrived
at its destination. This included five error correction/detection bits for each seven-bit character, four
synchronization characters, and a preamble to bring the transmitter to full power before the message was
sent. Acknowledgments, though shorter in length, were wrapped with similar overhead bits. In the
experiment, the numbers of messages generated for transmission each hour by each node were assumed
to be mutually independent Poisson-distributed random variables with parameter $\lambda$. The messages were
equally distributed among the four nodes. For example, if the arrival rate was 2,000 messages/hr, the
scenario generator created a file of 500 messages for each node. Each of these details was represented
in the simulation. Thus, the simulation represented an actual system using a real and less-than-trivial
protocol. In the simulation, the media access control process model (Figure 5) manages the transmission
and reception of messages. The tasks are decomposed into three basic functions: encapsulating and
queuing outgoing messages, decapsulating and delivering incoming messages, and managing an ongoing
transmission.

Figure 1. Hardware configuration.

Figure 3. The node model.

Figure 4. The server model.

Figure 5. Media access control process model.


4.  APPLICATION  TO  NETWORK  SIMULATION  VALIDATION


4.1  Foreword.  Simulations  must  be  subjected  to  an  ongoing  validation  process  before  inferences 
obtained  from  them  about  the  "real  world"  can  be  used  with  confidence.  The  permutation  test 
methodology  described  in  section  2  can  be  utilized  to  test  the  validity  of  a  multivariate  response  simulation 
with  respect  to  its  mean  behavior.  In  this  section,  the  validation  test  will  be  applied  to  a  small-scale 
simulation  of  a  limited  bandwidth  communications  network  developed  by  ARL. 

4.2 Reformulation of the Location Problem. Recall that we wish to test the identity of
c ($\geq 2$) multivariate continuous distributions $F_1, \ldots, F_c$, based on independent random samples
from each. Let

$$X_j^k = \left(X_{1j}^k, \ldots, X_{pj}^k\right)', \quad j = 1, \ldots, n_k, \quad k = 1, \ldots, c \tag{20}$$

be such a set of independent vector-valued random samples, where the c.d.f. of $X_j^k$ is denoted by $F_k(x)$.
We wish to verify that

$$H_0: F_1(x) = \cdots = F_c(x) = F(x) \quad \forall \, x, \tag{21}$$

where $F(x)$ is a common p-variate c.d.f. We wish to test $H_0$ (see equation 21) against a location parameter-type
alternative, that is, $H_0$ vs.

$$H_1: F_k(x) = F(x - \delta_k) \text{ for } k = 1, \ldots, c, \text{ and some } \delta_k \neq 0, \tag{22}$$

or equivalently

$$H_0: \delta_1 = \cdots = \delta_c = 0 \tag{23}$$

against the alternative that $\delta_1, \ldots, \delta_c$ are not all equal. Since only a shift in location is being
considered, homogeneity of the scale vectors of $F_1, \ldots, F_c$ is assumed.

• Special Case. We consider the special case of comparing two systems (i.e., "real world" and
simulated) on the basis of several carefully selected performance measures. We effect this comparison
by determining whether $F_1(x)$ and $F_2(x)$ differ in location. The data consist of two independent
vector-valued random samples. The $k$th random sample is of size $n_k$, where $k = 1, 2$. Denote the
empirical observations as $X_j^1$, $j = 1, 2, 3$; denote the simulated observations as $X_j^2$, $j = 1, \ldots, 7$.
The total number of observations is $N = n_1 + n_2 = 3 + 7 = 10$. There are no missing observations
nor tied values to consider.

4.3  MOP.  Although  data  for  a  number  of  MOP  were  collected  during  the  experiment,  comparisons 
between  empirical  and  simulation  results  were  limited  to  network  throughput,  network  delay,  and 
utilization.  These  were  the  only  measures  that  could  be  defined  by  continuous  random  variables. 

Network throughput is the average number of information bits that were successfully
transmitted  and  acknowledged  over  a  1-hr  test  cell.  Throughput  does  not  include  such 
overhead  as  the  acknowledgments  themselves  or,  in  the  event  of  collisions,  message 
retransmissions.  It  does,  however,  include  error  detection/correction  bits  and 
synchronization  characters. 

Network  delay  is  the  average  time  that  passes  between  a  message’s  arrival  at  a  host’s 
modem  until  the  acknowledgment  returns  to  the  host.  Messages  that  were  never 
completely  serviced  during  the  running  of  a  test  cell  were  not  considered  in  computing 
network delay.

Network utilization for a particular time interval is the amount of time spent actually
transmitting  messages,  message  retransmissions,  or  acknowledgments  during  that  interval, 
divided  by  the  amount  of  time  in  the  interval.  Messages,  retransmissions,  and 
acknowledgments  include  the  preamble  and  other  protocol  overhead  in  addition  to  actual 
transmission  bits. 
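
The three definitions above can be made concrete with a minimal sketch. The record layout (`Message`, its field names, and the `busy_intervals` list of transmission start and stop times) is hypothetical and chosen only for illustration; the report does not describe how the experiment's logs were stored. The functions work on a single test cell, so averaging over replications is left to the caller.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Message:
    """Hypothetical log record for one message (field names are assumptions)."""
    arrival_s: float        # arrival time at the host's modem, in seconds
    ack_s: Optional[float]  # time the acknowledgment returned; None if never serviced
    info_bits: int          # information bits, incl. error detection/correction and sync


def throughput_bits(messages: List[Message]) -> int:
    """Information bits successfully transmitted and acknowledged in one test cell."""
    return sum(m.info_bits for m in messages if m.ack_s is not None)


def mean_delay_s(messages: List[Message]) -> Optional[float]:
    """Average arrival-to-acknowledgment time; unserviced messages are excluded."""
    delays = [m.ack_s - m.arrival_s for m in messages if m.ack_s is not None]
    return sum(delays) / len(delays) if delays else None


def utilization(busy_intervals: List[Tuple[float, float]], interval_s: float) -> float:
    """Fraction of the interval spent transmitting messages, retransmissions, or acks."""
    return sum(stop - start for start, stop in busy_intervals) / interval_s
```

Note that retransmissions and acknowledgments count toward utilization but not toward throughput, which is why the two measures take different inputs in this sketch.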

4.4 Case Selection. While 16 combinations of message arrival rate and message length were tested in
the 1991 experiment, only 8 were chosen for the validation study. The 8 combinations were not chosen
in a purely random fashion, as it was desirable to ensure the simulation would be evaluated at the
2 extremes of both parameter ranges (i.e., arrival rate of 400 messages and message length of 48
characters; arrival rate of 2,000 messages and message length of 352 characters). One component of the
highest-order interaction was confounded such that the 16 combinations were divided into 2 blocks of
8 units each. The principal block was selected as it contained the two extreme conditions mentioned
previously. Given the data were not all collected under the same conditions, each combination was treated
as a homogeneous grouping; therefore, each served as an independent case to test the null hypothesis.

•  Observations.  The  appropriate  empirical  observations  were  taken  from  each  of  the  three  replications 
performed for the 1991 experiment. To ensure the independence of the sample observations, the
simulation was not run with the scenarios generated for the experimental test cells. The capability to utilize
actual  message  scenarios  as  simulation  input  does,  however,  afford  the  developer  a  useful  tool  for 
verification. 


4.5 Testing the Null Hypothesis. For these data, the number of permutations possible is $N!$ and the
number of distinct values of the $L_N$ statistic possible is

$$\frac{N!}{n_1!\, n_2! \cdots n_c!} = \frac{N!}{\prod_{k=1}^{c} n_k!}.$$

Since this validation study deals with small values of N (= 10) and p (= 3), the statistic may be calculated
for all permutations of the observations that lead to distinct values of the statistic; thus, an exact
application of the permutation test is possible. We have elected to replace the ranks $R_{ij}$ by a rank score
function of the form $R_{ij}/(N + 1)$, thereby reducing $T_{N,i}^{(k)}$ to $(N + 1)^{-1}$ times the average rank of the k-th
sample, i-th variate observations among the combined sample i-th variate observations. (The motivation for
this choice was the univariate case (p = 1) for which the statistic reduces to the Kruskal-Wallis test.)
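
The rank-score averages described above can be illustrated with a short sketch. It assumes the score is simply rank/(N + 1), that each variate is ranked separately within the combined sample, and that there are no ties; the function names are hypothetical, and the sketch stops short of the full statistic, whose definition lies outside this section.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Ranks 1..N of the values (assumes no ties, as stated for these data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def rank_score_averages(samples: List[List[Sequence[float]]]) -> List[List[float]]:
    """Return t[k][i]: the average of rank/(N + 1) for sample k on variate i."""
    pooled = [obs for sample in samples for obs in sample]
    n_total, p = len(pooled), len(pooled[0])
    # Rank each variate separately within the combined sample.
    cols = [ranks([obs[i] for obs in pooled]) for i in range(p)]
    result, start = [], 0
    for sample in samples:
        stop = start + len(sample)
        result.append(
            [sum(col[start:stop]) / (len(sample) * (n_total + 1)) for col in cols]
        )
        start = stop
    return result
```

With this scoring every average lies strictly between 0 and 1, and a sample whose observations are spread evenly through the combined ranking has an average near 1/2.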


This test was applied to an OPNET communications network simulation in order to validate simulated
output with respect to empirical observations. The hypothesis (equation 23) was tested for the MOP
outlined in section 4.3. The significance, or P-value, is the proportion of the 10!/(3! 7!) = 120 data
permutations providing an equivalent or larger test statistic than that obtained for the reference, or
observed, set. Assuming an a priori significance level of 0.05, the null hypothesis was rejected in five
of the eight test combinations. The observed test statistic values and the resultant P-values are summarized
in Table 2.

Table 2. Multivariate Multisample Rank Sum Test Results

$H_0 : \delta_1 = \cdots = \delta_c = 0$

Input Condition           Observed Statistic   P-Value   Reject/Fail to Reject
(messages, characters)
400, 48                   7.184996             0.04167   Reject
400, 256                  9.032534             0.00833   Reject
1,000, 144                6.970858             0.05833   Fail to Reject
1,000, 352                9.651814             0.00833   Reject
1,400, 48                 7.826177             0.02500   Reject
1,400, 256                6.581197             0.10000   Fail to Reject
2,000, 144                6.734517             0.07500   Fail to Reject
2,000, 352                9.210527             0.00833   Reject
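
Because the test is exact, every P-value in Table 2 should be a whole number of arrangements divided by 120. The short check below illustrates that arithmetic; the function name is hypothetical, and the P-values are copied from the table as printed (five decimal places), so the recovered counts are only as reliable as that rounding.

```python
from math import factorial, prod


def distinct_arrangements(sample_sizes):
    """N! / (n_1! n_2! ... n_c!) for the given sample sizes."""
    n_total = sum(sample_sizes)
    return factorial(n_total) // prod(factorial(n) for n in sample_sizes)


n_arrangements = distinct_arrangements([3, 7])  # two samples of sizes 3 and 7

# P-values reported in Table 2, in row order.
table2_p = [0.04167, 0.00833, 0.05833, 0.00833, 0.02500, 0.10000, 0.07500, 0.00833]

# Implied number of arrangements at least as extreme as the observed one.
counts = [round(p * n_arrangements) for p in table2_p]

# Number of test conditions rejected at the 0.05 level.
rejections = sum(p < 0.05 for p in table2_p)
```

The smallest attainable P-value is 1/120, or about 0.00833, which is the value reported for three of the eight conditions.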

4.6 Remarks. We note that the testing procedure and, hence, the conclusion reached, depend on the
measure of location shift that is employed. Alternative measures could be employed; however, use of a
different estimator for $\delta$ will likely produce a different conditional rejection region. Since we are free to
choose the test statistic, an alternate statistic suggested by Chung and Fraser (1958) was considered.

4.7 Chung and Fraser Test Statistic. The theoretical contributions of Chung and Fraser, while
substantial, have generally gone unnoticed. They proposed several randomization tests for the multivariate
two-sample problem that were initially developed for the normal-theory two-sample problem for which
the Hotelling $T^2$ test does not exist; however, the tests are also valid in a more general context as
nonparametric tests. The approach of Chung and Fraser is to select a statistic suitable for the univariate
case, apply it to each of the p variates, and add the resulting expressions. This approach does not take
into account covariances, as is required with the nonparametric counterpart of the Hotelling's $T^2$ statistic.
For measuring shift in location alternatives, a two-sample rank test may be obtained by recording ranks
and using the absolute value of the difference in sample means as a test statistic. One of the forms of the
Chung and Fraser statistic, and the one considered, is $\sum_{i=1}^{p} |T_i - S_i|$, appealing in its simplicity (Chung
and Fraser 1958).
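
The exact test built on this statistic can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes $T_i$ and $S_i$ are the mean ranks of the two samples on the i-th variate (a reading that reproduces the maximum value of 15 appearing in Table 3 for sample sizes 3 and 7 with three variates), it assumes no ties, and the function names are hypothetical. Because the statistic depends only on which observations fall in the first sample, enumerating the ways of choosing the first sample from the pooled observations covers every distinct arrangement.

```python
from itertools import combinations


def ranks(values):
    """Ranks 1..N of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def chung_fraser_stat(rank_cols, group1):
    """Sum over variates of |mean rank in sample 1 - mean rank in sample 2|."""
    n_total = len(rank_cols[0])
    g1 = list(group1)
    total = 0.0
    for col in rank_cols:
        s1 = sum(col[i] for i in g1)
        s2 = sum(col) - s1
        total += abs(s1 / len(g1) - s2 / (n_total - len(g1)))
    return total


def exact_permutation_test(sample1, sample2):
    """Return (observed statistic, exact P-value, number of arrangements)."""
    pooled = list(sample1) + list(sample2)
    n_total, n1, p = len(pooled), len(sample1), len(pooled[0])
    # Rank each variate separately within the pooled sample.
    rank_cols = [ranks([obs[i] for obs in pooled]) for i in range(p)]
    observed = chung_fraser_stat(rank_cols, range(n1))
    splits = list(combinations(range(n_total), n1))
    # Small tolerance so arrangements tied with the observed value are counted.
    extreme = sum(
        chung_fraser_stat(rank_cols, g) >= observed - 1e-9 for g in splits
    )
    return observed, extreme / len(splits), len(splits)
```

For two samples of sizes 3 and 7 that are completely separated on all three variates, the function returns a statistic of 15.0. The P-value can never fall below 1/120 in that setting, since the observed arrangement always counts itself.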

• Results. The observed test statistic values and the resultant P-values for Chung and Fraser's rank
test are summarized in Table 3. Employing the test statistic from section 4.7, the null hypothesis was
rejected for six of the eight test conditions. The results were intuitively appealing to the simulation
developer who had established validity on the sole basis of visual inspections of the data sets. The
conditional rejection region differs from the one established in section 4.5. Indeed, the same conclusion
regarding the simulation's validity was established for only three of the eight test conditions. However,
Figures 6-8 suggest a correlation structure among the three variates. This was expected from theoretical
considerations of the communications network. This structure is not accounted for by the Chung and
Fraser statistic.


Table 3. Chung and Fraser Test Statistic Results

$H_0 : \delta_1 = \cdots = \delta_c = 0$

Input Condition           Observed Statistic   P-Value   Reject/Fail to Reject
(messages, characters)
400, 48                   8.809525             0.08333   Fail to Reject
400, 256                  7.857143             0.12500   Fail to Reject
1,000, 144                12.619050            0.00833   Reject
1,000, 352                15.000000            0.00833   Reject
1,400, 48                 13.571430            0.00833   Reject
1,400, 256                15.000000            0.01667   Reject
2,000, 144                12.619050            0.03333   Reject
2,000, 352                15.000000            0.00833   Reject

5.  PROJECTION  OF  CURRENT  RESEARCH 

Future Considerations. As with any well-defined research initiative, additional areas warranting
investigation have surfaced, either due to apparent anomalies in the performance of the chosen test statistic
or simple curiosity. These areas lie outside the realm of this study but perhaps should become
the focus of future studies. Some possibilities include the following:

[Figure: Communications data for 400 messages, 48 characters]


• Sensitivity of the Multivariate Multisample Test Statistic. We saw in Tables 2 and 3 different
decisions made for the statistic advanced by Puri and Sen (section 2.14) and that of Chung and Fraser
(section 4.7). Puri and Sen's statistic $L_N$, whose development was considerably more intense, attempts
to take into account correlation structure. Chung and Fraser's statistic, which is more direct, does not.
Curiously, the decisions associated with the Chung and Fraser statistic may hold more visual appeal (i.e.,
face validity). Of course, the "curse of dimensionality" is ever present, complicating visual assessment.
Reconciliation of the contents of Tables 2 and 3 is a natural consequence of this observation.


• Combining Independent Tests. In this validation study, the same null hypothesis was tested for
several sets of independent samples, not all necessarily gathered under the same conditions, thereby
generating several sets of statistics by which to judge the validity of the communications network
simulation. Generally speaking, the military simulation and modeling community prefers a single statistic
reflecting the usability of any specific simulation. Given this situation, a possible approach is to combine
the various results into a single statistic on which an objective overall judgment can be based. For this,
we might begin by considering a technique for combining two-sample tests proposed by van Elteren
(1960).

• Alternative Test Statistics. Computer-intensive methods may be applied to a variety of hypothesis
testing situations. Keeping in mind that we consider these methods as the means by which to generate
the probability distribution of some statistic under the "null hypothesis is true" assumption, we are free
to select and customize a test statistic on the basis of its sensitivity towards an alternative. We may wish
to consider other test statistics that measure a difference of location.

• Generalization of Chung and Fraser Statistic. In section 4.7.1, we saw that the Chung and Fraser
randomization test for the multivariate two-sample problem performed well when considered as
an objective counterpart to visual inspection. In fact, it agreed in every case. In their paper, Chung and
Fraser (1958) state their methods are easily extended to the k-sample problem. It might be worthwhile
to pursue extension of Chung and Fraser's work to the multisample problem.

6.  SUMMARY 

As  reliance  upon  computer  simulations  to  model  processes  that  resist  analytical  description  increases, 
so  does  the  need  to  validate  the  simulations  themselves.  An  impartial  approach  to  simulation  validation 
is through statistical hypothesis testing. This paper details an application of a nonparametric multivariate
procedure to assess the validity of a communications network simulation model intended to emulate
a limited bandwidth combat radio net. The procedure, sometimes described as a permutation
or  randomization  test,  offers  considerable  flexibility  to  the  analyst  charged  with  maintaining  the  fidelity 
of  the  modeling  effort. 